Abstract:
A number of structural coverage criteria have been proposed to measure the adequacy of testing efforts. In the avionics and other critical systems domains, test suites satisfying structural coverage criteria are mandated by standards. With the advent of powerful automated test generation tools, it is tempting to simply generate test inputs to satisfy these structural coverage criteria. However, while techniques to produce coverage-providing tests are well established, the effectiveness of such approaches in terms of fault detection ability has not been adequately studied. In this work, we evaluate the effectiveness of test suites generated to satisfy four coverage criteria through counterexample-based test generation and a random generation approach—where tests are randomly generated until coverage is achieved—contrasted against purely random test suites of equal size. Our results yield three key conclusions. First, coverage criteria satisfaction alone can be a poor indication of fault finding effectiveness, with inconsistent results between the seven case examples (and random test suites of equal size often providing similar—or even higher—levels of fault finding). Second, the use of structural coverage as a supplement—rather than a target—for test generation can have a positive impact, with random test suites reduced to a coverage-providing subset detecting up to 13.5 percent more faults than test suites generated specifically to achieve coverage. Finally, Observable MC/DC, a criterion designed to account for program structure and the selection of the test oracle, can—in part—address the failings of traditional structural coverage criteria, allowing for the generation of test suites achieving higher levels of fault detection than random test suites of equal size. 
These observations point to risks inherent in the increase in test automation in critical systems, and the need for more research into how coverage criteria, test generation approaches, the test oracle used, and system structure jointly influence test effectiveness.
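The MC/DC criterion discussed above requires showing that each condition in a decision independently affects the decision's outcome, witnessed by a pair of tests that differ only in that condition yet produce different outcomes. A minimal sketch of finding such independence pairs by enumeration (the decision and function names here are illustrative, not taken from the paper):

```python
from itertools import product

def mcdc_pairs(decision, n):
    """For each of the n conditions of a boolean decision, find a pair of
    test vectors that differ only in that condition and yield different
    decision outcomes - the MC/DC independence requirement."""
    pairs = {}
    for i in range(n):
        for v in product([False, True], repeat=n):
            w = list(v)
            w[i] = not w[i]          # flip condition i only
            w = tuple(w)
            if decision(*v) != decision(*w):
                pairs.setdefault(i, (v, w))
                break
    return pairs

# Example decision with three conditions: a and (b or c)
def dec(a, b, c):
    return a and (b or c)

pairs = mcdc_pairs(dec, 3)
# A minimal MC/DC suite draws one vector pair per condition; in practice
# pairs share vectors, so roughly n+1 tests suffice for n conditions.
```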
Abstract:
In this article, we summarize the deployment of the Air Force Weather (AFW) HPC11 system at Oak Ridge National Laboratory (ORNL), including the process followed to successfully complete acceptance testing of the system. HPC11 is the first HPE/Cray EX 3000 system that has been successfully released to its user community in a federal facility. HPC11 consists of two identical 800-node supercomputers, Fawbush and Miller, with access to two independent and identical Lustre parallel file systems. HPC11 is equipped with Slingshot 10 interconnect technology and relies on the HPE Performance Cluster Manager software for system configuration. ORNL has a clearly defined acceptance testing process used to ensure that every new system deployed can provide the necessary capabilities to support user workloads. We worked closely with HPE and AFW to develop a set of tests that used the United Kingdom's Meteorological Office's Unified Model and 4-dimensional variational data assimilation. We also included benchmarks and applications from the Oak Ridge Leadership Computing Facility portfolio to fully exercise the HPE/Cray programming environment and evaluate the functionality and performance of the system. Acceptance testing of HPC11 required parallel execution of each element on Fawbush and Miller. In addition, careful coordination was needed to ensure successful acceptance of the newly deployed Lustre file systems alongside the compute resources. In this work, we present test results from specific system components and provide an overview of the issues identified, challenges encountered, and lessons learned along the way.
Abstract:
Constructing good test cases is difficult and time-consuming, especially if the system under test is still under development and its exact behavior is not yet fixed. We propose a new approach to compute test strategies for reactive systems from a given temporal logic specification using formal methods. The computed strategies are guaranteed to reveal certain simple faults in every realization of the specification and for every behavior of the uncontrollable part of the system's environment. The proposed approach supports different assumptions on occurrences of faults (ranging from a single transient fault to a persistent fault) and by default aims at unveiling the weakest one. We argue that such tests are also sensitive to more complex bugs. Since the specification may not define the system behavior completely, we use reactive synthesis algorithms with partial information. The computed strategies are adaptive test strategies that react to behavior at runtime. We work out the underlying theory of adaptive test strategy synthesis and present experiments for a safety-critical component of a real-world satellite system. We demonstrate that our approach can be applied to industrial specifications and that the synthesized test strategies are capable of detecting bugs that are hard to detect with random testing.
Abstract:
The DoD has achieved success with recent automatic test equipment (ATE) families, as evidenced by the Navy's Consolidated Automated Support System (CASS) and the Army's Integrated Family of Test Equipment (IFTE) programs. However, as these systems age, the increased requirements for technology insertion due to instrument obsolescence and the demands of advanced electronics are becoming evident. Recent advances in test technology promise to yield reduced total ownership costs (TOC) for ATE that can incorporate the new technology. The objective of the DoD Automatic Test System (ATS) Executive Agent Office (EAO) is to significantly reduce total ownership cost. Several objectives have been identified, including the use of synthetic instruments, support for legacy test program sets (TPSs), and more efficient ways of developing TPSs. The NxTest software architecture will meet these objectives by providing an open-systems approach to the system software. This will allow for the incorporation of commercial applications in the TPS development and execution environments and support current advances in test technology.
Abstract:
A device has to function properly under all possible conditions: e.g., for all temperatures within a given range, for all possible humidity values within a given range, etc. Ideally, it would be nice to be able to test a device for all possible combinations of these parameters, but the number of such combinations is often so huge that exhaustive testing is not possible. Instead, it is reasonable to check the device for all possible values of each parameter, for each possible pair of values of two parameters, and, in general, for all possible combinations of values of k parameters for some k. For n parameters, a straightforward testing design with this property contains O(n^k) · N^k experiments, where N is the number of tested values of each parameter. We show that, by using a more sophisticated testing design, we can decrease the number of experiments to a much smaller number, O(log^(k-1)(n)) · N^k.
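The k = 2 (pairwise) case described above can be illustrated with a simple greedy covering construction. This sketch only demonstrates the coverage requirement itself; it is not the paper's O(log^(k-1)(n)) · N^k design, which is asymptotically much tighter:

```python
from itertools import combinations, product

def greedy_pairwise(n_params, n_values):
    """Greedily build a test suite in which every pair of values of every
    pair of parameters occurs in at least one test (pairwise coverage)."""
    pairs_needed = {(p, q, vp, vq)
                    for p, q in combinations(range(n_params), 2)
                    for vp, vq in product(range(n_values), repeat=2)}
    candidates = list(product(range(n_values), repeat=n_params))
    suite = []
    while pairs_needed:
        # Pick the candidate covering the most still-uncovered pairs.
        best = max(candidates,
                   key=lambda t: sum((p, q, t[p], t[q]) in pairs_needed
                                     for p, q in combinations(range(n_params), 2)))
        suite.append(best)
        for p, q in combinations(range(n_params), 2):
            pairs_needed.discard((p, q, best[p], best[q]))
    return suite

suite = greedy_pairwise(4, 3)   # 4 parameters, 3 values each
# Exhaustive testing would need 3**4 = 81 experiments; the pairwise
# suite covers all value pairs with far fewer tests.
```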
Abstract:
Electromechanical systems built with Simulink or Ptolemy are widely used in industrial fields such as autonomous systems and robotics, and there is an urgent need to ensure their safety and security. Test case generation technologies are widely used for this purpose. State-of-the-art testing tools employ model-checking techniques or search-based methods to generate test cases. Traditional search-based techniques based on Simulink simulation are plagued by problems such as low speed and high overhead, while traditional model-checking techniques such as symbolic execution have limited performance when dealing with nonlinear elements and complex loops. Recently, coverage-guided fuzzing technologies have proven effective for test case generation, due to their high efficiency and their effectiveness on complex loop branches. In this paper, we apply fuzzing methods to improve model testing and demonstrate their effectiveness. Fuzzing methods aim to cover more program branches by mutating valuable seeds. Inspired by this feature, we propose a novel integration technology, SPsCGF, which leverages bounded model checking for symbolic execution to generate test cases as initial seeds, and then conducts fuzzing upon these worthy seeds. Over the evaluated benchmarks, which consist of industrial cases, SPsCGF achieves 8% to 38% higher model coverage and 3x-10x better time efficiency compared with state-of-the-art works.
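The seed-then-fuzz loop described above can be sketched as a generic coverage-guided fuzzing skeleton. All names here (`fuzz`, `toy`) are illustrative assumptions, not the SPsCGF implementation; in SPsCGF the initial seeds would come from bounded model checking rather than being handwritten:

```python
import random

def fuzz(program, seeds, iterations=2000, rng=None):
    """Minimal coverage-guided fuzzing loop: mutate corpus inputs and
    keep only mutants that cover new branches. `program` runs one input
    and returns the set of branch ids it covered."""
    rng = rng or random.Random(0)
    corpus = [list(s) for s in seeds]
    covered = set()
    for s in corpus:
        covered |= program(s)
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = list(parent)
        i = rng.randrange(len(child))
        child[i] += rng.choice([-1, 1])      # trivial mutation operator
        new_branches = program(child)
        if new_branches - covered:           # keep coverage-increasing inputs
            corpus.append(child)
            covered |= new_branches
    return corpus, covered

# Toy model under test: nested branch conditions over two inputs.
def toy(x):
    branches = {'entry'}
    if x[0] > 0:
        branches.add('b1')
        if x[1] < 0:
            branches.add('b2')
    return branches

corpus, covered = fuzz(toy, [[0, 0]])
```

The nested `b2` branch shows why seed quality matters: the fuzzer can only reach it after first retaining an input that reaches `b1`.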
Abstract:
The problem of testing fault-tolerant redundant digital systems is investigated. To test redundant systems through normal voter outputs, independent control of the output of each replicated unit is required. In the past it was assumed that independent control of the output of a replicated unit requires independent control of all of its inputs. The authors show that only partial control of the inputs is actually required. The critical input set problem, which is the problem of finding a set of inputs that need to be independently controlled, is formulated. Solutions are offered for different testing strategies, including exhaustive testing and deterministic testing, and for different levels of circuit description.
Abstract:
In this paper we first argue the case for a system which can accurately reproduce sensed input or stimuli for fair evaluation of wireless sensor network applications. It is shown, with a simple example, that consistent input is crucial in the evaluation of applications, and that the lack of such rigor may lead to wrong conclusions, and therefore a biased choice of what seems to be the best application. We present an architecture for a system that utilizes sensor nodes to provide the required stimuli and can exercise control over other sensor nodes that are executing the application under test. In our architecture, each sensor node executing the application under test is paired with a modified sensor node called the control node. We showcase a prototype implementation of the architecture using the MICAz hardware platform and TinyOS operating system software. Evaluation results for the prototype in a network setting are then presented. Our architecture, to the best of our knowledge, is the first to provide the benefits of both hardware-based and software-based approaches to enable controlled testing of sensor network applications. We also provide an optimization formulation for finding the least number of nodes through which control packets can be disseminated to every control node in the network.
Abstract:
Automated test systems (ATS) are application-specific systems that help enhance the reliability and productivity of testing activities at different stages of the product development cycle and manufacturing. Different tests are necessary to anticipate and validate the performance of a product and its sub-systems along the development cycle. Moreover, in the production phase, quality control systems that perform up to 100% inspection may be present, and the integration of test results with the company's management systems may be required. In this paper, the design of ATS is analyzed in light of design methodologies, which can basically be understood as procedures stating stages and intermediary results to discipline the development process of technical systems. Usually, design methodologies are presented as a set of prescribed actions and recommendations, kept with some degree of generality to allow their compatibility with a wide range of industrial products based on different technology branches. In this paper, an adaptation of Pahl and Beitz's approach allowed the derivation of a reference model for ATS development. The proposed stages are explained considering the typical division of an ATS, i.e., physical system, hardware, and software. The adapted methodology, in which prescriptions are made at an intermediate degree of generality and directed to one particular kind of system, can ease the design of a highly context-dependent kind of system, helping the design team to consider important aspects in decisions taken along the development process and keeping these decisions linked to the requirements defined in the early stages. Through the use of the proposed methodology, shorter development cycles are expected, since fewer design loops tend to occur in a more systematized environment.